Swollen-Collapsed Transition in Random Hetero-polymers 
ABSTRACT

A lattice model of a hetero-polymer with random hydrophilic-hydrophobic charges interacting
with the solvent is introduced, whose continuum counterpart has been proposed by T. Garel,
L. Leibler and H. Orland [7]. The transfer matrix technique is used to study various constrained
annealed systems which approximate, to various degrees of accuracy, the original quenched model. For
highly hydrophobic chains an ordinary $\theta$-point transition is found from a high temperature swollen
phase to a low temperature compact phase. Depending on the type of constrained averages, at very
low temperatures either a swollen phase or a coexistence between compact and swollen phases is found.
The results are carefully compared with the corresponding ones obtained in the continuum limit,
and various improvements in the original calculations are discussed.
PACS numbers: 05.70.Fh, 61.41.+e, 64.75.+g, 75.10.Nr



I. INTRODUCTION 
The main reason for studying random hetero-polymers in solution is a possible connection of this problem
with the protein folding problem. Indeed, proteins are believed to be special cases of random
hetero-polymers selected by nature. Before dealing with these special cases, it is of great importance to understand the typical behaviour
of the various kinds of random hetero-polymer models that have been introduced, as it may give important insight
into which types of interactions are indispensable for folding, and which types of interactions, on the other hand, are of
secondary importance.




Several models of (quenched) randomness have been considered. Here, we study the role of the solvent (water)
in the equilibrium properties of the collapsed phase, as it is commonly believed that the hydrophobic effect
is the main driving force for the folding transition. Most proteins in nature consist of a strongly hydrophobic core,
surrounded by hydrophilic (less hydrophobic) residues. We restrict ourselves to the simple coarse-grained model
originally introduced by Obukhov [6], where the monomers of a single chain are randomly hydrophilic or
hydrophobic (RHH), and interact with the solvent molecules through an effective two-body short range interaction.
The statics of the continuum version of this model has been studied by Garel et al. [7], both in the case of annealed
and quenched disorder, while the dynamics (with quenched disorder) has been studied by Thirumalai et al. The
model has also been studied by Timoshenko et al. and Moskalenko et al. with the Gaussian self-consistent
method.

We have chosen to study a (2d square) lattice version of the RHH model, and the method we used to assess
the conformational entropy is the transfer matrix (TM) method, which is best suited to the case of annealed
disorder. Furthermore, using the approximation scheme introduced by Morita [11], we are able to give lower bounds
for the quenched free energy. It will turn out that the annealed case may exhibit a very rich phase diagram with
re-entrant compact-swollen transitions, and we come to different conclusions than Garel et al. [7]. The case of the
annealed average with fixed mean for the hydrophobic-hydrophilic charges gives the same results one can get in the
continuum limit for the quenched average using the method of reference [7] (see section VII for details). We go one
step further by analyzing a better approximation to the quenched system, which cures some problems present in the
previous approximations and in the standard approach presented in

This work is built up in the following way. In section II, we introduce the model in more detail. In section III we
introduce the concept of constrained annealing; in section IV we show that the effective models after averaging over
the disorder involve 2- and 3-body interactions, and the general phase diagram of such models is discussed.
In section V, the TM method is used to assess the conformational entropy of the polymer chain. The results are
presented in section VI, together with an outlook of the items that are still to be investigated. Finally, in section VII
we give an interpretation of our results, and a detailed comparison with those obtained for the continuum model by 



II. DEFINITION OF THE MODEL 



The polymer chain is represented by a self-avoiding walk (SAW) on a lattice where each site is either visited by
the walk (i.e. is occupied by one monomer of the chain), or occupied by a solvent molecule. The interactions in the
model are two-body short-range interactions. The only interactions we take into account are those between solvent
molecules and monomers occupying nearest-neighbor sites. Hydrophilicities $\lambda_i$ are attached to each monomer $i$
of the walk such that the Hamiltonian is given by

$$ H = -\sum_{i=0}^{N} \lambda_i z_i , \tag{1} $$

where the sum runs over the $N+1$ sites of the lattice occupied by the $N$-step walk, and $z_i$ is the number of
nearest-neighbor contacts of monomer $i$ with solvent molecules, i.e. the number of nearest-neighbor sites of site $i$ not occupied
by the walk.

The hydrophilicities $\lambda_i$ are supposed to be independent identically distributed random variables with a Gaussian
distribution with mean $\lambda_0$ and variance $\lambda^2$:

$$ P(\lambda_i) = \frac{1}{\sqrt{2\pi\lambda^2}} \exp\left[ -\frac{(\lambda_i - \lambda_0)^2}{2\lambda^2} \right] , \tag{2} $$

and the average over this (a priori) distribution is indicated by $\langle\langle\cdot\rangle\rangle$. If $\lambda_i > 0$, the corresponding monomer is
hydrophilic and attracts solvent molecules, whereas if $\lambda_i < 0$, the monomer is hydrophobic and repels solvent molecules.
The canonical partition function for SAW of $N$ steps with a fixed disorder configuration $\{\lambda_i\}$ is then



$$ Z_N(\{\lambda_i\}) = \sum_{W_N} \exp\left( \beta \sum_{i=0}^{N} \lambda_i z_i \right) , \tag{3} $$

where the sum has to be taken over all $N$-step SAW starting from the origin.
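
For very short chains the sum over walks can be carried out by brute force. The following is a minimal sketch, assuming a two-dimensional square lattice; the function names `enumerate_saws`, `solvent_contacts` and `partition_function` are illustrative and not taken from the paper. It enumerates all $N$-step SAW from the origin, counts the solvent contacts $z_i$ of each monomer, and evaluates the fixed-disorder partition function.

```python
import math

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # square-lattice moves (assumed lattice)

def enumerate_saws(n_steps):
    """Return every n-step self-avoiding walk from the origin as a list of sites."""
    walks = []

    def extend(walk, visited):
        if len(walk) == n_steps + 1:
            walks.append(list(walk))
            return
        x, y = walk[-1]
        for dx, dy in STEPS:
            site = (x + dx, y + dy)
            if site not in visited:
                walk.append(site)
                visited.add(site)
                extend(walk, visited)
                visited.remove(site)
                walk.pop()

    extend([(0, 0)], {(0, 0)})
    return walks

def solvent_contacts(walk):
    """z_i: number of nearest-neighbor sites of monomer i not visited by the walk."""
    occupied = set(walk)
    return [sum((x + dx, y + dy) not in occupied for dx, dy in STEPS)
            for x, y in walk]

def partition_function(n_steps, lambdas, beta):
    """Sum over all walks of exp(beta * sum_i lambda_i * z_i)."""
    total = 0.0
    for walk in enumerate_saws(n_steps):
        z = solvent_contacts(walk)
        total += math.exp(beta * sum(l * zi for l, zi in zip(lambdas, z)))
    return total
```

The enumeration grows exponentially with $N$, so this is only useful as a check on small systems; it is not the transfer matrix method used in the rest of the paper.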

If monomers can rearrange themselves along the chain and change their hydrophilicities, e.g. through chemical reactions,
the hydrophilicities have to be considered as thermal (annealed) variables, which approach equilibrium on the same time scale as the
configurational degrees of freedom. The physics of such hetero-polymers is given by the average of the partition
function over the disorder distribution (annealed average)

$$ \langle\langle Z_N(\{\lambda_i\}) \rangle\rangle = \sum_{W_N} \exp\left( \beta\lambda_0 \sum_{i=0}^{N} z_i + \frac{\beta^2\lambda^2}{2} \sum_{i=0}^{N} z_i^2 \right) . \tag{4} $$
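
The annealed average factorizes over monomers, and each factor is a one-dimensional Gaussian integral: for a hydrophilicity with mean $\lambda_0$ and variance $\lambda^2$, the average of $\exp(\beta\lambda_i z)$ is $\exp(\beta\lambda_0 z + \beta^2\lambda^2 z^2/2)$. The sketch below checks this closed form against a plain trapezoidal quadrature; the helper names and the integration window are assumptions made for illustration.

```python
import math

def gaussian_average(f, mean, std, n_points=20001, width=12.0):
    """Trapezoidal estimate of the average of f(lam) for lam ~ N(mean, std^2)."""
    lo, hi = mean - width * std, mean + width * std
    h = (hi - lo) / (n_points - 1)
    norm = math.sqrt(2.0 * math.pi * std ** 2)
    total = 0.0
    for k in range(n_points):
        lam = lo + k * h
        weight = 0.5 if k in (0, n_points - 1) else 1.0
        density = math.exp(-(lam - mean) ** 2 / (2.0 * std ** 2)) / norm
        total += weight * density * f(lam)
    return total * h

def annealed_weight(z, beta, lam0, lam):
    """Closed-form per-monomer factor exp(beta*lam0*z + beta^2*lam^2*z^2/2)."""
    return math.exp(beta * lam0 * z + 0.5 * beta ** 2 * lam ** 2 * z ** 2)
```

For example, `gaussian_average(lambda l: math.exp(0.7 * l * 2), -0.5, 1.2)` and `annealed_weight(2, 0.7, -0.5, 1.2)` should agree to high accuracy.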

Instead, if the monomer sequence of the chain is fixed, as is the case for proteins, the hydrophilicities are frozen
while the polymer is approaching thermal equilibrium; the average over the disorder distribution then has to be taken
over the logarithm of the partition sum (quenched average), to yield the quenched free energy

$$ f_q = -\frac{1}{\beta N} \langle\langle \ln\left[ Z_N(\{\lambda_i\}) \right] \rangle\rangle , \tag{5} $$

which is a much harder task to accomplish.



III. CONSTRAINED ANNEALING 

In order to avoid the difficult direct computation of the quenched average (5), we have applied an idea first introduced
by Morita [11]. This is the so-called Equilibrium Ensemble Approach (EEA) (see e.g. [11] for a recent review and
discussion). The EEA consists of a systematic approximation procedure for the quenched free energy by annealed
averages. It can be shown [13] that each successive approximation gives a better or equally good lower bound for the
quenched free energy.

Each approximation consists in performing an annealed average over a new Hamiltonian $H^* = H + H_d$, where $H$ is
the original Hamiltonian (1), and $H_d$ is a fictitious disorder potential, which contains a number of parameters. These
parameters (Lagrange multipliers) have to be tuned in such a way that some moments of the a posteriori (annealed)
distribution of the disorder are equal to the a priori (quenched) ones. In annealed averages, the a posteriori distribution
$P^*(\{\lambda_i\},\{z_i\})$ is defined as

$$ P^*(\{\lambda_i\},\{z_i\}) = \frac{\prod_i P(\lambda_i)\, \exp\left( -\beta H^*(\{\lambda_i\},\{z_i\}) \right)}{\int d\{\lambda_i\} \sum_{W_N} \prod_i P(\lambda_i)\, \exp\left( -\beta H^*(\{\lambda_i\},\{z_i\}) \right)} . \tag{6} $$

The average over this distribution will be denoted by $\langle\cdot\rangle = \int d\{\lambda_i\} \sum_{W_N} \cdot\; P^*(\{\lambda_i\},\{z_i\})$. In principle, one has to
fix all the moments of $P_{ann}(\{\lambda_i\}) = \sum_{W_N} P^*(\{\lambda_i\},\{z_i\})$ to obtain the quenched result, which is as difficult as the
direct computation of (5). Nevertheless, one can hope to obtain a reasonable approximation of the quenched case by
fixing a few suitably chosen moments. Moreover, the method is variational, and fixing more and more moments yields
tighter lower bounds for the quenched free energy.

In this work, we have considered three different cases of annealing: without constraints ($a_0$), constraining the first
moment of the overall hydrophilicity ($a_1$), and constraining the first and the second moment of the overall hydrophilicity ($a_2$).
For all these cases, we obtain the same formal expression for the effective homo-polymer partition function

$$ Z_N^{a} = \sum_{W_N} \exp\left[ N\beta_0 + \beta_1 \sum_i z_i + \beta_2 \sum_i z_i^2 \right] , \tag{7} $$

and any further complexity is hidden in the computation of $\beta_0$, $\beta_1$ and $\beta_2$ for the different cases. The strategy we will
follow is to study the general homo-polymer model defined by (7) in the $(\beta_1,\beta_2)$-plane (the $\beta_0$ dependence being
trivial). Then, we investigate which temperature dependent trajectories in the $(\beta_1,\beta_2)$-plane the three annealed
averages give rise to.

In case ($a_0$), the simple annealed average (4) has already been computed in the preceding section, and equation (7)
is recovered with the definitions

$$ \beta_0 = 0 , \qquad \beta_1 = \beta\lambda_0 , \qquad \beta_2 = \frac{\beta^2\lambda^2}{2} . \tag{8} $$

We can immediately argue from equations (7) and (8) that even hydrophobic chains ($\lambda_0 < 0$) will be swollen at low
enough temperature. Indeed, since $\beta_2 \gg |\beta_1|$ as $\beta \to \infty$, the repulsive $\beta_2 \sum_i z_i^2$ term causes the number of contacts
with solvent to be maximized, independently of the sign and the value of $\lambda_0$.

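In case ($a_1$), we fix the a posteriori overall hydrophilicity $\langle\sum_i \lambda_i\rangle/N$ to its a priori value $\lambda_0$:

$$ \frac{1}{N} \left\langle \sum_i \lambda_i \right\rangle = \lambda_0 . \tag{9} $$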



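We impose this constraint by defining a generalized partition function $Z_N^{a_1}(\{\lambda_i\},h)$ which depends on the Lagrange
multiplier $h$, and by finding the effective value $h^*$ for which (9) holds:

$$ Z_N^{a_1}(\{\lambda_i\},h) = \sum_{W_N} \exp\left[ \beta \sum_i \lambda_i z_i - \beta h \left( \sum_i \lambda_i - N\lambda_0 \right) \right] . \tag{10} $$

We recover (7) with the following definitions

$$ \beta_0 = \frac{\beta^2\lambda^2 h^2}{2} , \qquad \beta_1 = \beta\lambda_0 - \beta^2\lambda^2 h , \qquad \beta_2 = \frac{\beta^2\lambda^2}{2} . \tag{11} $$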



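In terms of $h$, condition (9) becomes

$$ h^* = \frac{1}{N} \left\langle \sum_i z_i \right\rangle , \tag{12} $$

and the free energy $f_{a_1}$ is thus given by

$$ f_{a_1} = -\frac{1}{\beta N} \ln \langle\langle Z_N^{a_1}(\{\lambda_i\},h^*) \rangle\rangle . \tag{13} $$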
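
The fixed-mean condition is a self-consistency equation: $h^*$ must equal the mean number of solvent contacts per monomer, computed with $\beta_1 = \beta\lambda_0 - \beta^2\lambda^2 h^*$ and $\beta_2 = \beta^2\lambda^2/2$. The sketch below solves it by bisection on exhaustively enumerated short walks. It is only an illustration under stated assumptions: a square lattice, a normalization by the number of monomers $N+1$ rather than $N$, and hypothetical helper names; the paper itself works with transfer matrices on strips.

```python
import math

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def contact_sums(n_steps):
    """For every n-step SAW return (sum_i z_i, sum_i z_i^2)."""
    out = []

    def extend(walk, visited):
        if len(walk) == n_steps + 1:
            z = [sum((x + dx, y + dy) not in visited for dx, dy in STEPS)
                 for x, y in walk]
            out.append((sum(z), sum(v * v for v in z)))
            return
        x, y = walk[-1]
        for dx, dy in STEPS:
            site = (x + dx, y + dy)
            if site not in visited:
                visited.add(site)
                walk.append(site)
                extend(walk, visited)
                walk.pop()
                visited.remove(site)

    extend([(0, 0)], {(0, 0)})
    return out

def mean_contacts(sums, beta1, beta2, n_monomers):
    """A_2 = <sum_i z_i> / n_monomers under weights exp(beta1*S1 + beta2*S2)."""
    logs = [beta1 * s1 + beta2 * s2 for s1, s2 in sums]
    shift = max(logs)  # subtract the maximum to avoid overflow
    weights = [math.exp(v - shift) for v in logs]
    numerator = sum(w * s1 for w, (s1, _) in zip(weights, sums))
    return numerator / (sum(weights) * n_monomers)

def solve_h(sums, beta, lam0, lam, n_monomers, tol=1e-12):
    """Bisection for h* = A_2(h*); h - A_2(h) is increasing in h."""
    beta2 = 0.5 * beta ** 2 * lam ** 2

    def gap(h):
        beta1 = beta * lam0 - beta ** 2 * lam ** 2 * h
        return h - mean_contacts(sums, beta1, beta2, n_monomers)

    lo, hi = 0.0, 4.0  # z_i lies between 0 and 4 on the square lattice
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)
```

Bisection is safe here because the mean contact number increases with $\beta_1$, and $\beta_1$ decreases with $h$, so the root is unique.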



Note that the quenched free energy computed in the continuum model by Garel et al. [7] is exactly the free energy
$f_{a_1}$ (13), if one performs the annealed average with constraint (9) within the analytic calculation scheme of reference
[7]. We will comment on this in the final discussion of section VII.

In case ($a_2$), in addition to (9), we also put a constraint on the overall variance $\langle\sum_i (\lambda_i - \lambda_0)^2\rangle/N$:

$$ \frac{1}{N} \left\langle \sum_i \lambda_i^2 \right\rangle = \lambda_0^2 + \lambda^2 . \tag{14} $$



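In the same way as before, we introduce a second Lagrange multiplier $s$, and we define a generalized partition function

$$ Z_N^{a_2}(\{\lambda_i\},h,s) = \sum_{W_N} \exp\left[ \beta \sum_i \lambda_i z_i - \beta h \left( \sum_i \lambda_i - N\lambda_0 \right) - \frac{\beta s}{2} \left( \sum_i \lambda_i^2 - N(\lambda_0^2 + \lambda^2) \right) \right] . \tag{15} $$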



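After performing the average, we recover (7) with the following definitions

$$ \beta_0 = \frac{1}{2} \left[ \frac{\beta^2\lambda^2}{s'} (h + \lambda_0 s)^2 + \beta\lambda^2 s - \ln s' \right] , \quad \beta_1 = \frac{\beta\lambda_0 - \beta^2\lambda^2 h}{s'} , \quad \beta_2 = \frac{\beta^2\lambda^2}{2s'} , \quad s' = 1 + \beta\lambda^2 s . \tag{16} $$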



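Conditions (9) and (14) yield two coupled equations for $h^*$ and $s^*$ with solution:

$$ h^* + \lambda_0 s^* = A_2(h^*,s^*) , \tag{17} $$

$$ s^* = \frac{\sqrt{1 + 4\beta^2\lambda^2 \left( A_3(h^*,s^*) - A_2^2(h^*,s^*) \right)} - 1}{2\beta\lambda^2} , \tag{18} $$

where $A_2 = \langle\sum_i z_i\rangle/N$ and $A_3 = \langle\sum_i z_i^2\rangle/N$. In the next section, we will show that $A_2$ and $A_3$ are closely connected
to effective 2- and 3-body interactions, respectively, between the monomers. The free energy $f_{a_2}$ is now given by

$$ f_{a_2} = -\frac{1}{\beta N} \ln \langle\langle Z_N^{a_2}(\{\lambda_i\},h^*,s^*) \rangle\rangle . \tag{19} $$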



As mentioned before, it can be shown [13] that at any temperature $f_{a_0} \le f_{a_1} \le f_{a_2} \le \cdots \le f_q$. Hence, by fixing more
and more moments, we get a better approximation for the quenched free energy.



IV. PHASE DIAGRAM 

We now discuss the phase diagram of the model defined in (7), in the $(\beta_1,\beta_2)$-plane, with $\beta_0$ constant:

$$ Z_N^{eff} = C \sum_{W_N} \exp\left[ \beta_1 \sum_i z_i + \beta_2 \sum_i z_i^2 \right] . \tag{20} $$



The number $z_i$ of nearest-neighbor contacts of monomer $i$ (not at the chain ends) with solvent molecules can be
expressed in terms of the number $n_i$ of nearest-neighbor monomers of monomer $i$ not along the chain using $n_i =
2(d-1) - z_i$, where we have exploited the incompressibility condition. In this way the quantities $\sum_i z_i$ and $\sum_i z_i^2$
can be related to effective 2-body and 3-body interactions between the monomers. We define $N_2$ to be the number
of nearest-neighbor 2-monomer contacts not along the chain, while $N_3$ is the number of nearest 3-monomer contacts
not along the chain. On a hypercubic lattice in $\mathbb{R}^d$, we have a nearest 3-monomer contact when two monomers are
both nearest neighbors (not along the chain) to the third monomer, and we have that

$$ N_2 = \frac{1}{2} \sum_i n_i , \qquad N_3 = \frac{1}{2} \left( \sum_i n_i^2 - \sum_i n_i \right) . \tag{21} $$

In terms of $N_2$ and $N_3$, the reduced Hamiltonian can be rewritten

$$ -\beta H_{eff} = 2(d-1) \left[ \beta_1 + 2(d-1)\beta_2 \right] N - \left[ 2\beta_1 + 2(4d-5)\beta_2 \right] N_2 + 2\beta_2 N_3 . \tag{22} $$
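
The rewriting in terms of 2- and 3-body contacts is pure bookkeeping and can be checked numerically. The sketch below is a hedged illustration: it assumes a square lattice ($d = 2$), restricts all sums to interior monomers so that $n_i = 2(d-1) - z_i$ holds exactly (chain ends are excluded, which is an assumption of this example), and uses hypothetical function names. It compares $\beta_1 \sum_i z_i + \beta_2 \sum_i z_i^2$ computed directly with the expression built from $N_2 = \frac{1}{2}\sum_i n_i$ and $N_3 = \frac{1}{2}(\sum_i n_i^2 - \sum_i n_i)$.

```python
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]
D = 2  # square lattice (assumption of this example)

def enumerate_saws(n_steps):
    """All n-step self-avoiding walks from the origin."""
    walks = []

    def extend(walk, visited):
        if len(walk) == n_steps + 1:
            walks.append(list(walk))
            return
        x, y = walk[-1]
        for dx, dy in STEPS:
            site = (x + dx, y + dy)
            if site not in visited:
                visited.add(site)
                walk.append(site)
                extend(walk, visited)
                walk.pop()
                visited.remove(site)

    extend([(0, 0)], {(0, 0)})
    return walks

def interior_counts(walk):
    """Solvent contacts z_i and non-bonded monomer contacts n_i, interior monomers only."""
    index = {site: k for k, site in enumerate(walk)}
    zs, ns = [], []
    for k in range(1, len(walk) - 1):
        x, y = walk[k]
        z = n = 0
        for dx, dy in STEPS:
            j = index.get((x + dx, y + dy))
            if j is None:
                z += 1          # empty neighbor site: solvent
            elif abs(j - k) != 1:
                n += 1          # occupied by a monomer not bonded to k
        zs.append(z)
        ns.append(n)
    return zs, ns

def direct_form(zs, beta1, beta2):
    return beta1 * sum(zs) + beta2 * sum(z * z for z in zs)

def contact_form(ns, beta1, beta2, d=D):
    m = len(ns)                                  # number of monomers summed over
    n2 = 0.5 * sum(ns)                           # 2-body contacts
    n3 = 0.5 * (sum(n * n for n in ns) - sum(ns))  # 3-body contacts
    c = 2 * (d - 1)
    return (c * (beta1 + c * beta2) * m
            - (2 * beta1 + 2 * (4 * d - 5) * beta2) * n2
            + 2 * beta2 * n3)
```

Because the sums run over interior monomers only, `n2` here can be half-integer; for a full chain the end monomers would need the separate treatment that the text sets aside.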
Since $\beta_2 \ge 0$ for the models that we have considered, the 3-body term is either attractive or absent. Note that
$\beta_2$ also enters the 2-body term with a repulsive effect; as already noted, the total contribution of the $\beta_2$ term
in equation (22) has to be repulsive. We recall that the self-avoidance constraint automatically introduces effective
n-body repulsive terms, with n = 2, 3, 4 and so on.

For $\beta_2 = 0$, we have a self-avoiding walk with only 2-body interactions, each of energy $2\beta_1/\beta$. For these models, the
existence of a critical value $\beta_1^\theta < 0$ at which the chain undergoes a second order $\theta$-transition is well known [15-17].
The transition takes place when the 2-body attractive interaction exactly balances the 2-body steric repulsion. For
$\beta_1 > \beta_1^\theta$, the chain is in the swollen phase, while for $\beta_1 < \beta_1^\theta$, the chain is in the collapsed phase. The $\theta$-point is a
tri-critical point [18] corresponding to a $\phi^6$ field theory with a Landau-Ginzburg functional; the necessary stabilizing
3-body term is provided by the self-avoidance constraint.

When $\beta_2 > 0$, the attractive 3-body term $2\beta_2 N_3$ competes with the corresponding steric repulsion. At the mean
field level [14], with increasing $\beta_2$, a tri-critical line departs from the $\theta$-point at $\beta_2 = 0$. The tri-critical line ends at
a multi-(tetra-)critical point $(\beta_1^m,\beta_2^m)$, when the 3-body attractive interaction exactly balances the steric repulsion.
This corresponds to a $\phi^8$ Landau-Ginzburg theory, since the necessary stabilizing 4-body term is provided by the
self-avoidance constraint. Increasing $\beta_2$ further, the transition line becomes a coexistence line between the swollen
and the compact phase. This phase diagram is qualitatively sketched in Figure 1.

At zero temperature, when the entropy is negligible with respect to the energy, we can give rigorous results for the
asymptotic behavior of the coexistence line. If $\alpha = \beta_1/\beta_2$ is fixed and $\beta_2 \to \infty$, we can rewrite the Hamiltonian

$$ -\beta H_{eff} = \beta_2 \sum_i \left( \alpha z_i + z_i^2 \right) = -N\beta_2 \frac{\alpha^2}{4} + \beta_2 \sum_i \left( \frac{\alpha}{2} + z_i \right)^2 . \tag{23} $$

Since $0 \le z_i \le 2(d-1)$, for $\alpha > -2(d-1)$ the ground state is at $z_i = 2(d-1)$ for any $i$, and the walk is swollen.
On the other hand, for $\alpha < -2(d-1)$, the ground state is at $z_i = 0$ for any $i$, and the walk is maximally compact.
For $\alpha = -2(d-1)$ (i.e. $\beta_1 \simeq -2(d-1)\beta_2$), the energy of the two competing ground states is the same, and there is
phase coexistence.
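
The ground-state argument amounts to maximizing the per-monomer weight $\alpha z + z^2$ over the allowed contact numbers $z \in \{0, \dots, 2(d-1)\}$. A minimal sketch of that maximization follows; the function name `ground_state_contacts` is illustrative, and the treatment of each monomer independently is the same idealization as in the text.

```python
def ground_state_contacts(alpha, d=2):
    """Values of z in {0, ..., 2(d-1)} that maximize alpha*z + z^2."""
    z_max = 2 * (d - 1)
    scores = {z: alpha * z + z * z for z in range(z_max + 1)}
    best = max(scores.values())
    return [z for z, score in scores.items() if score == best]
```

Because the weight is convex in $z$, the maximum always sits at an endpoint, $z = 0$ (compact) or $z = 2(d-1)$ (swollen), and the two are degenerate exactly at $\alpha = -2(d-1)$.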

The presence of a multi-critical point, if not rigorously proved, is numerically established, as one can see from
Figure 2, where the order parameter $A_2(\beta_1,\beta_2)$ is plotted as a function of $\beta_1$ for different values of $\beta_2$.

Camacho and Schanke [19], using exact enumerations, have obtained a phase diagram which exhibits features similar
to ours. A transition line between the swollen and the collapsed phase is present, which is first order at low temperatures,
and becomes second order at higher temperatures through a multi-critical point. However, they treat the quenched
case and describe a slightly different model (i.e. the HP-model). A translation of this model in terms of hydrophobic
charges would introduce an extra interaction term which depends on the product of the charges, and is absent in our
model.



V. TRANSFER MATRIX 



We have addressed the numerical study of the lattice model defined by (7) by the transfer matrix technique on
a two-dimensional square lattice. With this method it is possible to consider infinite polymers on a lattice of finite
width (strip) [20-22]. The price to pay is the uncertain extrapolation to the thermodynamic limit, caused by the
limited width of the strip that we can achieve.

In a grand-canonical context, the generalized two point correlation function is defined as

$$ G(x,r,\beta_1,\beta_2) = \sum_{N=1}^{\infty} x^N \sum_{W_N} \exp\left[ \beta_1 \sum_i z_i + \beta_2 \sum_i z_i^2 \right] , \tag{24} $$

where $x$ is the step fugacity and the second sum runs over the SAW of $N$ steps which connect the origin with an
arbitrary point at distance $r$. We have neglected the dependence on $\beta_0$ because it only amounts to a simple rescaling of the
fugacity. The two-point correlation function decreases exponentially in $r$ at long distances, if $x$ is less than the critical
fugacity $x_c(\beta_1,\beta_2)$. This defines the correlation length $\xi(x,\beta_1,\beta_2)$:

$$ G(x,r,\beta_1,\beta_2) \sim \exp\left[ -\frac{r}{\xi(x,\beta_1,\beta_2)} \right] . \tag{25} $$



The correlation length $\xi_n(x,\beta_1,\beta_2)$ can be calculated exactly on a lattice strip of infinite length and finite width $n$,
with the TM method. The idea is to write recursion relations between a strip of length $r$ and a strip of length $r+1$.
We consider a walk on the strip, which goes from the left to the right, and we cut the strip at column $r$. The local
configuration at $r$ is then given by the set of occupied sites of column $r$ and how these are connected to each other
by the part of the walk at the left of $r$. Since the interaction $\beta_2 \sum_i z_i^2$ gives rise to effective 3-body interactions, it
is necessary to define the local configurations at stage $r$ taking the lattice bonds occupied by the walk between the
columns $r-2$ and $r$ into account.

We combine all the possible local configurations $i$ at column $r$ with all the possible local configurations $j$ at column
$r+1$. They yield a non-zero TM element if it is possible to connect them without producing disconnected pieces,
and $T_{ij}$ is given by

$$ T_{ij} = x^{t_{ij}} \exp\left[ \sum_{a=1}^{n} \left( \beta_1 z_a^{ij} + \beta_2 \left( z_a^{ij} \right)^2 \right) \right] , \tag{26} $$

where $t_{ij}$ is the number of occupied bonds between columns $r-1$ and $r$, and $z_a^{ij}$ is the number of non-occupied
nearest-neighbor sites of the site at row $a$ and column $r-1$, if this is occupied by the walk (see Figure 3).

The number of possible configurations, and therefore the computational effort, can be strongly reduced by considering
periodic boundary conditions (the strip becomes a cylinder) and then by exploiting all the symmetry properties
of the strip. Furthermore, periodic boundary conditions reduce the finite size effects. In this way, within reasonable
time, we are able to study strip widths up to $n = 6$, corresponding to 5387 configurations and 154149 non-zero matrix
elements.

The correlation function can be expressed in terms of the trace of the $r$-th power of the TM $T$:

$$ G(x,r,\beta_1,\beta_2) \sim \mathrm{Tr}\, T^r , \tag{27} $$

and the correlation length (25) is related to the largest eigenvalue $\lambda_n^{max}(x,\beta_1,\beta_2)$ of $T$, for a strip of width $n$:

$$ \xi_n(x,\beta_1,\beta_2) = -\frac{1}{\ln \lambda_n^{max}(x,\beta_1,\beta_2)} . \tag{28} $$

The critical fugacity $x_c^n$ is determined by the value at which the correlation length diverges, i.e.

$$ \lambda_n^{max}(x_c^n,\beta_1,\beta_2) = 1 . \tag{29} $$
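
The numerical procedure behind the critical fugacity can be sketched as follows: compute the largest eigenvalue of the transfer matrix at a given fugacity, then tune the fugacity until that eigenvalue equals one. The example uses a hypothetical 2x2 toy matrix whose entries are powers of $x$; it is not the strip transfer matrix of the paper, whose construction requires the full list of local configurations. Power iteration and bisection are choices made for this illustration.

```python
import math

def largest_eigenvalue(matrix, iterations=500):
    """Power iteration for the dominant eigenvalue of a non-negative matrix."""
    n = len(matrix)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)              # v is normalized so that max(v) = 1
        v = [c / lam for c in w]
    return lam

def toy_transfer_matrix(x):
    """Hypothetical 2-state matrix with entries x**t_ij (illustration only)."""
    return [[x, x ** 2],
            [x, 0.0]]

def correlation_length(x):
    """xi = -1 / ln(lambda_max), meaningful below the critical fugacity."""
    return -1.0 / math.log(largest_eigenvalue(toy_transfer_matrix(x)))

def critical_fugacity(lo=0.01, hi=1.0, tol=1e-12):
    """Bisection on lambda_max(x) = 1; lambda_max increases with x."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if largest_eigenvalue(toy_transfer_matrix(mid)) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For this toy matrix the characteristic polynomial is $\lambda^2 - x\lambda - x^3$, so the critical fugacity is the root of $x^3 + x - 1 = 0$, which gives an independent check of the routine.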

The computation of the free energy per monomer $f = -\ln Z_N/(\beta N)$ and of any other quantity of physical interest
(e.g. the mean number of monomer-solvent contacts $A_2$), is now straightforward in terms of the critical fugacity:
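
As a sketch of how such quantities follow from the critical fugacity, assume the standard grand-canonical relation that the canonical partition function grows as $x_c^{-N}$, apart from the trivial factor $e^{N\beta_0}$. Under that assumption, which is stated here for illustration and is not a transcription of the paper's own formula, one obtains:

```latex
Z_N \sim e^{N\beta_0}\, x_c(\beta_1,\beta_2)^{-N}
\;\Longrightarrow\;
\beta f = \ln x_c(\beta_1,\beta_2) - \beta_0 ,
\qquad
A_2 = \frac{1}{N}\frac{\partial \ln Z_N}{\partial \beta_1}\bigg|_{\beta_0}
    = -\frac{\partial \ln x_c(\beta_1,\beta_2)}{\partial \beta_1} .
```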

The thermal exponent $\nu$, which characterizes the divergence of the correlation length at the critical fugacity, $\xi \sim
(x_c - x)^{-\nu}$, is a good indicator of a collapse transition. A SAW in two dimensions has the value $\nu = 3/4$ in the swollen
phase [23], $\nu = 1/2$ in the collapsed phase and $\nu = 4/7$ on the tri-critical line [24,25].

In order to compute the thermal exponent, we first calculate the density $\rho_n(\beta_1,\beta_2)$ of monomers in the strip

$$ \rho_n(\beta_1,\beta_2) = \frac{x}{n} \left. \frac{\partial \ln \lambda_n^{max}}{\partial x} \right|_{x = x_c^n} . \tag{31} $$

Then, we use a phenomenological renormalization (PR) group procedure [26,27] to obtain finite size estimates for the
thermal exponent; the underlying hypothesis is the finite size scaling behavior [28] of the correlation length for $n \gg 1$
and $(x_c - x) \ll 1$:

$$ \xi_n(x,\beta_1,\beta_2) = n\, g\left[ n^{1/\nu}(x_c - x), \beta_1, \beta_2 \right] , \tag{32} $$

where $g$ is a scaling function. Using the single strip critical fugacity estimate (29) leads to

$$ \nu_n = \left( \frac{\ln(\rho_n/\rho_{n-1})}{\ln(n/(n-1))} + 2 \right)^{-1} . \tag{33} $$

Note that we compare the derivative of the correlation length at criticality for two consecutive strip widths, but
criticality is determined in a different way for different widths.
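
The finite-size estimate of the thermal exponent rests on the scaling of the critical monomer density with the strip width, $\rho_n \propto n^{1/\nu - 2}$ in two dimensions, so that two consecutive widths give $1/\nu = \ln(\rho_n/\rho_{n-1})/\ln(n/(n-1)) + 2$. The sketch below implements this two-width estimate; the function name and the synthetic test densities are assumptions made for illustration, not data from the paper.

```python
import math

def thermal_exponent(rho_n, rho_prev, n):
    """Estimate nu from critical densities on strips of width n and n-1.

    Assumes the scaling rho_n ~ n**(1/nu - 2) at the critical fugacity.
    """
    slope = math.log(rho_n / rho_prev) / math.log(n / (n - 1))
    return 1.0 / (slope + 2.0)
```

A density that does not change with the width corresponds to $\nu = 1/2$ (compact phase), while a density decaying as $n^{-2/3}$ corresponds to $\nu = 3/4$ (swollen phase).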
This is not the most accurate way of applying the ideas of the PR. In fact, the critical fugacity can be determined
for two consecutive widths at once by

$$ \frac{\xi_n(x_c^{n,n-1},\beta_1,\beta_2)}{n} = \frac{\xi_{n-1}(x_c^{n,n-1},\beta_1,\beta_2)}{n-1} . \tag{34} $$

This estimate is better than (29) and the thermal exponent can easily be obtained

$$ \frac{1}{\nu_{n,n-1}} = \frac{\ln\left[ \left( \partial\xi_n/\partial x \right) / \left( \partial\xi_{n-1}/\partial x \right) \right]_{x = x_c^{n,n-1}}}{\ln(n/(n-1))} - 1 . \tag{35} $$

Nevertheless, we have used the rougher formulae (29) and (33), because solving (34) numerically is a much harder
task, especially in proximity of coexistence.



VI. RESULTS 

We will show that the trajectory that the system follows in the $(\beta_1,\beta_2)$-plane, for the different models with decreasing
temperature, only depends on the ratio $\lambda_0/\lambda = \lambda_{eff}$. The position on the trajectory at a given temperature, however,
does depend on $\lambda_0$ and $\lambda$ separately. Although the trajectories cannot be calculated analytically, we give a general
qualitative picture, which will be confirmed by the numerical data.

The maximum strip width available, $n = 6$, is rather small. Nevertheless, the data in the various plots already
show a good convergence, and we think that the obtained results are reliable. Moreover, we are mainly interested
in qualitative features of the phase diagram and not in precise quantitative values of critical exponents or transition
temperatures.

A. Simple Annealing ($a_0$)

After elimination of the temperature in (8), the locus of the trajectory in the $(\beta_1,\beta_2)$-plane is given by the equation

$$ g_{a_0}(\beta_1,\beta_2) = \beta_2 - \frac{\beta_1^2}{2\lambda_{eff}^2} = 0 , \tag{36} $$

which describes a parabola. As the temperature is positive, the $(\beta_1 > 0)$-branch of this parabola has to be considered
for $\lambda_{eff} > 0$, while the $(\beta_1 < 0)$-branch is relevant for $\lambda_{eff} < 0$.

Hence, at sufficiently low temperature, the chain will always be swollen, no matter how strongly hydrophobic it is
(i.e. however negative $\lambda_{eff}$ is). We note here that the opposite result, i.e. that even highly hydrophilic chains are compact at sufficiently low
temperature, has been found by Garel et al. [7] in the corresponding continuum model, due to improper consideration
of the incompressibility condition. This condition is automatically accounted for in the definition of our lattice model.
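
The low-temperature swelling can be made concrete with a few lines of arithmetic. For simple annealing $\beta_1 = \beta\lambda_0$ and $\beta_2 = \beta^2\lambda^2/2$, so the ratio $\alpha = \beta_1/\beta_2 = 2\lambda_0/(\beta\lambda^2)$ tends to zero as $\beta$ grows, and eventually exceeds the zero-temperature coexistence value $-2(d-1)$ regardless of $\lambda_0$. The sketch below uses illustrative function names; comparing $\alpha$ with $-2(d-1)$ is only the asymptotic criterion of section IV, not the full finite-temperature phase boundary.

```python
def trajectory_a0(beta, lam0, lam):
    """(beta_1, beta_2) reached at inverse temperature beta for simple annealing."""
    return beta * lam0, 0.5 * beta ** 2 * lam ** 2

def locus_a0(beta1, beta2, lam_eff):
    """beta_2 - beta_1^2 / (2 lam_eff^2); vanishes on the trajectory."""
    return beta2 - beta1 ** 2 / (2.0 * lam_eff ** 2)

def slope_ratio(beta, lam0, lam):
    """alpha = beta_1 / beta_2 = 2 lam0 / (beta lam^2)."""
    beta1, beta2 = trajectory_a0(beta, lam0, lam)
    return beta1 / beta2
```

For instance, with $\lambda_0 = -5$ and $\lambda = 2$ the ratio crosses $-2$ at $\beta = -\lambda_0/\lambda^2 = 1.25$ and stays above it for all lower temperatures.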

The typical behaviour for a strongly hydrophobic chain ($\lambda_{eff} \ll -1$) is as follows. At high temperatures, it will
be swollen for entropic reasons. Then, with decreasing temperature, it will undergo a 2nd order $\theta$-transition from
swollen to collapsed. Finally, at even lower temperature, it will undergo a 1st order collapsed to swollen transition.
We present numerical evidence of this remarkable re-entrant behaviour in Figure 4: the crossings of the various
$n$-estimates of the thermal exponent are typical of the $\theta$-point [16] and are just around the value $\nu_\theta \simeq 0.57$, and the
jump of the order parameter $A_2 = \langle\sum_i z_i\rangle/N$ provides strong evidence for the first order transition from the compact
to the swollen phase; the value in the compact phase being $A_2 \simeq 0$, and in the swollen phase $A_2 \simeq 2$. We have
thus shown that considering 3-body interactions does not change the universality class of the $\theta$-transition, as long as
one is referring to the tri-critical line. Although not surprising, this result is not trivial in two dimensions. A more
interesting theoretical question concerns the value of the thermal exponent in $d = 2$ at the multi-critical point, but
the TM approach employed here is ineffective because of the limited strip width we are able to study.

We note here that we have considered the overall hydrophilicity $\lambda_0$ to be constant. The chemical reactions that
give rise to annealed hydrophilicities, however, may be temperature dependent and may cause $\lambda_0$ to vary with
temperature [29]. Nevertheless, the re-entrant behavior is a quite robust feature: even when $\lambda_0$ diverges exponentially
to $-\infty$ ($\lambda_0 < 0$) with a rate $a > 0$,

$$ \lambda_0(\beta) = \lambda_0(0) \exp(a\beta) , \tag{37} $$

re-entrant behavior is still observed for $a$ not too big.

B. Fixing the Mean ($a_1$)



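After elimination of the temperature in (11), the locus of the trajectory in the $(\beta_1,\beta_2)$-plane is given by the equation

$$ g_{a_1}(\beta_1,\beta_2) = \beta_1 - \sqrt{2\beta_2}\,\lambda_{eff} + 2\beta_2 A_2(\beta_1,\beta_2) = 0 , \tag{38} $$

which has to be combined with condition (12) (fixing the mean)

$$ A_2(\beta) \equiv A_2(\beta_1,\beta_2) = \frac{1}{N} \frac{\partial \ln Z_N}{\partial \beta_1} = \frac{1}{N} \left\langle \sum_i z_i \right\rangle (\beta_1,\beta_2) . \tag{39} $$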




First, we show that equation (38) defines a unique trajectory $\beta_1(\beta_2)$. It is easy to see that $\partial g_{a_1}/\partial\beta_1 > 0$, $\forall \beta_1, \beta_2$.
Hence, we can apply the implicit function theorem, but only if $\beta_2 < \beta_2^m$, when $g_{a_1}(\beta_1,\beta_2)$ is a continuous function of
its arguments. For $\beta_2 > \beta_2^m$, $g_{a_1}(\beta_1,\beta_2)$ (in particular $A_2$) is discontinuous on the coexistence line $\beta_1^{co}(\beta_2)$. Since the
discontinuity develops in the thermodynamic limit, and since $\partial g_{a_1}/\partial\beta_1 > 0$ for any finite $N$, the only possibility
at coexistence is $g_{sw} = g_{a_1}(\beta_1 \downarrow \beta_1^{co}(\beta_2),\beta_2) > g_c = g_{a_1}(\beta_1 \uparrow \beta_1^{co}(\beta_2),\beta_2)$ (the chain is collapsed for $\beta_1 < \beta_1^{co}(\beta_2)$
and swollen otherwise). If $g_{sw}$ and $g_c$ have the same sign, (38) is still uniquely satisfied far away from coexistence.
Instead, if $g_{sw} > 0$ and $g_c < 0$, (38) can only be satisfied on the coexistence line. In this case, a fraction $f_c$ of the chain
is collapsed and the remaining fraction $1 - f_c$ is swollen, so that physical mean values are mixtures of the pure phase
values:

$$ A_2(\beta) = f_c A_{2,c}(\beta) + (1 - f_c) A_{2,sw}(\beta) . \tag{40} $$

Thus, equation (38) becomes a condition on $f_c$:

$$ f_c\, g_c + (1 - f_c)\, g_{sw} = 0 . \tag{41} $$

We now prove that at zero temperature ($\beta_2 \to \infty$) the only way to satisfy (38) for the chain is to be on the coexistence
line with $f_c = \frac{1}{2}$, i.e. half collapsed and half swollen, independently of the value of $\lambda_{eff}$, as long as it is kept fixed. For
$\beta_1/\alpha = \beta_2 \to \infty$, (38) becomes

$$ \alpha + 2A_2(\infty) = 0 . \tag{42} $$

As we have seen at the end of section IV, for $\alpha > -2$ the chain is swollen, and $A_{2,sw}(\infty) = 2$, but (42) implies $\alpha = -4$,
which is a contradiction. Similarly, for $\alpha < -2$ the chain is compact and $A_{2,c}(\infty) = 0$, implying $\alpha = 0$. Hence, the only
remaining possibility is $\alpha = -2$, i.e. coexistence of the swollen and the compact phase:

$$ A_2(\infty) = f_c A_{2,c} + (1 - f_c) A_{2,sw} , \tag{43} $$

where $f_c$ is the collapsed fraction of the chain at coexistence. Plugging this in (42) yields $f_c = \frac{1}{2}$, such that $A_2(\infty) = 1$.

The phase separation already occurs at finite temperature, since condition (38) implies that $A_2(\beta_1,\beta_2)$ is a
continuous function along the trajectory, and since the only way to reach the value $A_2(\infty) = 1$ continuously is to move
along the coexistence line.
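
The last step of the zero-temperature argument can be written out explicitly. On the square lattice the coexistence value is $\alpha = -2$, the compact phase has no solvent contacts and the swollen phase has two per monomer, so inserting the two-phase mixture into the zero-temperature condition gives, as a worked sketch:

```latex
\alpha = -2,\quad A_{2,c}(\infty) = 0,\quad A_{2,sw}(\infty) = 2
\;\Longrightarrow\;
-2 + 2\left[ f_c \cdot 0 + (1 - f_c)\cdot 2 \right] = 0
\;\Longrightarrow\;
f_c = \tfrac{1}{2}, \qquad A_2(\infty) = 1 .
```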

All this, in combination with the numerical data, gives the following qualitative scenario of what happens on lowering
the temperature:

• There exists a particular value $\lambda_m < 0$ such that for $\lambda_{eff} = \lambda_m$ the trajectory passes through the multi-critical
point, and then follows the coexistence line.

• For $\lambda_{eff} > \lambda_m$, the trajectory hits the coexistence line coming from the swollen phase, and this will happen further
away from the multi-critical point the larger $\lambda_{eff} - \lambda_m$ is. Then, it follows the coexistence line, and the collapsed
fraction ($f_c$) of the chain steadily increases to become $\frac{1}{2}$ at zero temperature.

• For $\lambda_{eff} < \lambda_m$, the trajectory hits the coexistence line coming from the compact phase. This means that the
chain first collapses with a 2nd order $\theta$-transition, before the trajectory hits the coexistence line. Then, it follows the
coexistence line, and the collapsed fraction ($f_c$) of the chain steadily decreases to become $\frac{1}{2}$ at zero temperature.

This qualitative scenario is confirmed by the numerical evidence shown in Figures 5 and 6. They respectively show
the numerical results for the trajectory in the $(\beta_1,\beta_2)$-plane, and the variation of the order parameter $A_2$ with
temperature.
C. Fixing the Mean and the Variance ($a_2$)

Using (17) and (16), we obtain $\beta\lambda = (\beta_1 + 2A_2\beta_2)/\lambda_{eff} \equiv X > 0$, and the locus of the trajectory in the $(\beta_1,\beta_2)$-plane
is given by the following equation

$$ g_{a_2}(\beta_1,\beta_2) = \frac{X^2}{2\beta_2} - \frac{1}{2} - \sqrt{\frac{1}{4} + X^2 \left( A_3 - A_2^2 \right)} = 0 , \tag{44} $$

where $A_2(\beta)$ is defined as in (39), and

$$ A_3(\beta) \equiv A_3(\beta_1,\beta_2) = \frac{1}{N} \frac{\partial \ln Z_N}{\partial \beta_2} = \frac{1}{N} \left\langle \sum_i z_i^2 \right\rangle (\beta_1,\beta_2) . \tag{45} $$

The qualitative behaviour of the trajectories in the phase plane is very similar to that of the previous subsection. The
collapsed fraction $f_c$ of the chain, however, does depend on $\lambda_{eff}$ at zero temperature. In order to show this, we repeat
the same argument as before. For $\beta_1/\alpha = \beta_2 \to \infty$, condition (44) simplifies to

$$ \alpha + 2\left[ A_2(\infty) - \lambda_{eff} \sqrt{A_3(\infty) - A_2^2(\infty)} \right] = 0 , \tag{46} $$

which is the analogue of equation (42). For $\alpha > -2$ the chain is swollen and $A_{2,sw}(\infty) = 2$, $A_{3,sw}(\infty) = 4$, but (46)
implies $\alpha = -4$, which is a contradiction. For $\alpha < -2$ the chain is collapsed, and $A_{2,c}(\infty) = A_{3,c}(\infty) = 0$ leads to
$\alpha = 0$. Again, we conclude that the chain is at coexistence at zero temperature, but in this case using (43) and the
analogous formula for $A_3(\infty)$, we get the following condition for the collapsed fraction $f_c$ of the chain:

$$ f_c = \frac{1}{2} \left( 1 - \frac{\lambda_{eff}}{\sqrt{1 + \lambda_{eff}^2}} \right) . \tag{47} $$
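
The dependence of the collapsed fraction on $\lambda_{eff}$ can be checked by direct substitution. At coexistence on the square lattice ($\alpha = -2$) the two-phase mixture gives $A_2(\infty) = 2(1 - f_c)$ and $A_3(\infty) = 4(1 - f_c)$, and the zero-temperature condition reads $\alpha + 2[A_2 - \lambda_{eff}\sqrt{A_3 - A_2^2}] = 0$. Solving it yields $f_c = \frac{1}{2}(1 - \lambda_{eff}/\sqrt{1 + \lambda_{eff}^2})$; the sketch below, with illustrative function names, verifies that this expression makes the residual vanish.

```python
import math

def collapsed_fraction(lam_eff):
    """Zero-temperature collapsed fraction on the square lattice (alpha = -2)."""
    return 0.5 * (1.0 - lam_eff / math.sqrt(1.0 + lam_eff ** 2))

def zero_temperature_residual(f_c, lam_eff, alpha=-2.0):
    """alpha + 2*[A_2 - lam_eff*sqrt(A_3 - A_2^2)] for a two-phase mixture."""
    a2 = 2.0 * (1.0 - f_c)   # swollen monomers carry z = 2, collapsed ones z = 0
    a3 = 4.0 * (1.0 - f_c)   # same mixture for z^2
    fluctuation = math.sqrt(max(a3 - a2 * a2, 0.0))
    return alpha + 2.0 * (a2 - lam_eff * fluctuation)
```

For $\lambda_{eff} = 0$ this reduces to the fixed-mean result $f_c = \frac{1}{2}$, while hydrophobic chains ($\lambda_{eff} < 0$) are more than half collapsed.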



In Figure 7 the numerical results are shown for the variation of the order parameter $A_2$ with temperature. The
analogue of Figure 6 with the trajectories in the $(\beta_1,\beta_2)$-plane turns out to be indistinguishable from Figure 6 itself,
and is therefore not shown.

The behaviour of the chain seems qualitatively unchanged when the constraint on the variance is added to
the fixed mean case. It can easily be verified that for $\lambda_0 = 0$ equations (38) and (44), defining the trajectories in the
$(\beta_1,\beta_2)$-plane, are equal, and they are qualitatively very similar for $\lambda_0 \neq 0$. If we compare the free energies (13) and
(19), however, taking the proper values of $\beta_0$ into account, we find that the constraint on the variance is crucial for
the low temperature behaviour of the free energy. In the fixed mean case (like in the simple annealed case), the free
energy diverges linearly to $-\infty$ with $\beta$, whereas fixing also the variance yields a finite free energy. The divergence
can easily be understood. A fraction (all, in the simple annealed case) of the monomers want to be as hydrophilic as
possible and to maximize their solvent contacts, in order to minimize the energy, while the other fraction has to be
very hydrophobic to keep the mean finite. For entropic reasons the fractions are exactly $\frac{1}{2}$. This is illustrated in
Figure 8, where the free energies are compared for the various cases for the same values of $\lambda_0$ and $\lambda$.



D. Towards the Quenched Average 

So far, we have only fixed overall moments of the type $\langle\sum_i \lambda_i^l\rangle$, $l \in \mathbb{N}$. In this way, we do not impose the $\lambda_i$
to be independent variables. Or equivalently, even if we fix all the overall moments, the $\lambda_i$ still have the complete
freedom to rearrange themselves along the chain. Hence, we can assume that fixing both mean and variance may
be a good approximation for a hetero-polymer whose hydrophobicities are fixed, but are allowed to migrate. In a
protein, however, not only the hydrophilicities, but also their positions along the chain are fixed. This corresponds to
the quenched case.

In order to get a reasonable approximation by means of annealed averages, one should also ensure the independence of the $\lambda_i$. A first try might be to impose e.g. $(\sum_i \lambda_i \lambda_{i+1})/N = \lambda_0^2$, but after performing the average, one discovers immediately that this would involve a coupling between all the $z_j$, which, obviously, cannot be handled by the TM method. Instead, as a first approach, one could start with
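$$\left\langle \exp\Big( -\frac{1}{2} \sum_{j,k} (\lambda_j - \lambda_0) M_{jk} (\lambda_k - \lambda_0) + \sum_j \lambda_j \beta (z_j - h) - \log P(\{\lambda_i\}) \Big) \right\rangle , \qquad M_{jk} = \int_0^{2\pi} \frac{dq}{2\pi} \, \frac{\exp(iq(j-k))}{s + d\cos(q)} . \qquad (48)$$

After performing the average, this results in the following expression (up to constants)

$$\exp\Big( \frac{\beta^2}{2} \sum_{j,k} (z_j - h) M^{-1}_{jk} (z_k - h) + \lambda_0 \beta \sum_j (z_j - h) + \frac{N}{2} \log\big[ s + \sqrt{s^2 - d^2} \big] \Big) . \qquad (49)$$

We have introduced the Lagrange multipliers $h$, $s$ and $d$, which combined fix $\sum_j \lambda_j = N \lambda_0$, $\sum_j \lambda_j^2 = N(\lambda_0^2 + \lambda^2)$, and $\sum_{j,k} \lambda_j M_{jk} \lambda_k = \lambda_0^2 \sum_{j,k} M_{jk}$, which ensures the independence of a linear combination of the $\lambda_j$. We expect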
that fixing the latter may already describe the quenched case qualitatively very well. The only reason we did not do the numerics for this case is of a purely practical nature. In order to calculate $\langle \sum_i z_i z_{i+1} \rangle$, we have to consider configurations on 3 columns instead of on 2. This increases the size of the transfer matrix so drastically that we would have to limit ourselves to very narrow strip widths. Furthermore, we have an extra self-consistency equation (i.e. for $d$) to solve numerically. All this makes it infeasible (in terms of CPU time) for us at the moment to perform this calculation for a reasonable strip width (i.e. $> 4$).
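The following sketch illustrates why only nearest-neighbour products $z_i z_{i+1}$ appear after the average. It assumes that $M_{jk}$ is the Fourier integral of $1/(s + d\cos q)$ (our reading of the definition above), and it replaces the integral by a sum over the allowed momenta of a finite ring. The ring, the function names and the parameter values are assumptions made for illustration and are not part of the original calculation. With these assumptions the inverse of $M$ has $s$ on the diagonal and $d/2$ between neighbouring sites, and its log-determinant per site approaches $\log[(s + \sqrt{s^2 - d^2})/2]$ for $s > |d|$.

```python
import numpy as np


def coupling_matrix(n_sites, s, d):
    """M_jk on a ring: (1/N) * sum_q exp(i q (j - k)) / (s + d cos q)."""
    q = 2.0 * np.pi * np.arange(n_sites) / n_sites
    # ifft supplies the 1/N factor and the exp(+i q r) phases; the result is
    # real because the spectrum is symmetric under q -> -q.
    first_column = np.fft.ifft(1.0 / (s + d * np.cos(q))).real
    j, k = np.indices((n_sites, n_sites))
    return first_column[(j - k) % n_sites]


def nearest_neighbour_matrix(n_sites, s, d):
    """Candidate inverse of M: s on the diagonal, d/2 between ring neighbours."""
    eye = np.eye(n_sites)
    shift = np.roll(eye, 1, axis=0)  # ones at positions (i, i - 1), with wrap-around
    return s * eye + 0.5 * d * (shift + shift.T)


def log_det_per_site(n_sites, s, d):
    """(1/N) * log det of the nearest-neighbour matrix (requires s > |d|)."""
    _, logdet = np.linalg.slogdet(nearest_neighbour_matrix(n_sites, s, d))
    return logdet / n_sites


if __name__ == "__main__":
    s, d = 2.0, 1.0  # illustrative values with s > |d|
    m_inv = np.linalg.inv(coupling_matrix(8, s, d))
    print(np.round(m_inv, 6))
    print(log_det_per_site(64, s, d), np.log((s + np.sqrt(s**2 - d**2)) / 2.0))
```

Because the inverse couples each site only to itself and to its two neighbours, the quadratic form in the $z_j$ contains $z_j^2$ and $z_j z_{j+1}$ terms only, which is why a transfer matrix on 3 columns would suffice.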

Nevertheless, one may anticipate that some of the typical behaviour found for annealed averages should not be present in the quenched case. The re-entrant behaviour at intermediate temperatures is due to the competition between the configurational entropy and the energy on the one hand, and the entropy of the $\lambda_i$ distribution on the other hand. In the quenched case, the entropy of the $\lambda_i$ distribution is absent, and hence re-entrant behaviour, if present at all, cannot have its origin there. The phase separation (in low dimensions) is due to the possibility for the monomers to rearrange and to form a hydrophobic compact core. Since this is not possible for the quenched polymer, we do not expect macroscopic phase separation in that case. Instead, microscopic phase separation seems to play an important role for quenched sequences PJlCj]. However, one might expect the quenched polymer to behave as an effective homo-polymer, where the groundstate is either swollen or compact, depending on the value of $\lambda_{\mathrm{eff}}$.



VII. DISCUSSION OF RESULTS 



We have studied a simple lattice model for a random hydrophobic-hydrophilic chain in a solvent, with a Gaussian
distribution for the hydrophilicities. We have considered the case of annealed disorder, without constraints, and with 
constraints on the first and second moments of the overall hydrophobicity. 

We have obtained both exact analytical results (mainly at $T = 0$) and numerical ones, employing the transfer matrix technique on a 2d square lattice. One may ask whether the 2d results are relevant in 3d too. For example, in the random sequence model with charge product interaction, a simple mean field argument shows that $d = 2$ is a very peculiar case jiO). For the model considered here, analytical results at the mean field level do not show any qualitative difference between different spatial dimensions, and our exact results at $T = 0$ exhibit the same qualitative behavior for any $d > 1$. Hence, we believe the TM results in 2d to be meaningful in any dimension $d > 1$.

We now discuss our results and compare them with the ones obtained by Garel et al. in the corresponding continuum model. The main result of Garel et al. is that the annealed and quenched cases are very similar. They find that, at sufficiently low temperature, the polymer is always collapsed, even for hydrophilic chains ($\lambda_0 > 0$). Depending on the average degree of hydrophobicity, the transition to the collapsed phase is either first or second order.

• For the simple annealed case ($a_0$), we have shown that for any mean hydrophilicity $\lambda_0 > 0$ the chain is always swollen, while for any $\lambda_0 < 0$ the chain is swollen at sufficiently low temperature. Using transfer matrix techniques, we have found that, for sufficiently negative $\lambda_0$, a temperature interval $(T_1, T_2)$ exists where the chain is collapsed. Coming from the high temperature region, the chain undergoes a standard 2nd order $\theta$-point transition at $T_2$ (in the same universality class as homo-polymers p5|). Lowering the temperature further, the chain undergoes a 1st order (re-entrant) transition at $T_1$ towards the swollen phase.
Hence, in the annealed case, we come to the opposite conclusion for the low temperature behavior to the one predicted by Garel et al. Nevertheless, if one takes the incompressibility of the monomer-solvent system properly into account (i.e. putting an upper bound on the monomer density), it is possible to recover the same qualitative picture in the continuum model of Garel et al., too.

• In the annealed case with fixed mean ($a_1$), we have found that, for any $\lambda_{\mathrm{eff}}$, there is coexistence of the swollen and the collapsed phase (phase separation) at sufficiently low temperature, and the chain is exactly half collapsed and half swollen at $T = 0$. For $\lambda_{\mathrm{eff}} < \lambda_m$, a temperature interval $(T_1, T_2)$ exists where the chain is collapsed. At $T_2$ the chain undergoes a 2nd order $\theta$-transition from the swollen phase, while at $T_1$ a fraction of the chain swells, and lowering the temperature, the swollen fraction steadily increases to the $T = 0$ value $\frac{1}{2}$. For $\lambda_{\mathrm{eff}} > \lambda_m$ a temperature $T_1$ exists above which the chain is swollen. At $T_1$ a fraction of the chain collapses, and lowering the temperature, the collapsed fraction of the chain increases steadily to the $T = 0$ value $\frac{1}{2}$.



As already noted in section EI], the expressions for the quenched case of Garel et al. are exactly the same as the ones we obtain for the continuum model in the case ($a_1$). Using a one-parameter Gaussian trial wave function for the monomer density, Garel et al. find a collapsed phase for any $\lambda_{\mathrm{eff}}$ at low temperature. Inspired by the observation that they in fact describe the case ($a_1$), and by the phase separation observed in our lattice model, we tried a one-parameter trial function with a hydrophobic compact core (fraction $f_c$) and hydrophilic swollen tails (phase separation). In the low temperature limit, we recover the result $f_c = \frac{1}{2}$, and obtain a free energy that is considerably lower than the one obtained using the trial function of Garel et al.

Note that the free energy diverges linearly to $-\infty$ for $T \to 0$, as in the annealed case (see Figure 8). The same kind of divergence appears in the quenched free energy computed by Garel et al., but the quenched free energy should not diverge at zero temperature.



• In the annealed case with fixed mean and variance ($a_2$), we have found results very similar to those for the case ($a_1$). The main differences are that the collapsed fraction of the chain at $T = 0$ depends on $\lambda_{\mathrm{eff}}$ (see equation (47)), and that the free energy does not diverge at $T = 0$. We repeated these calculations for the continuum model, and again we find that at $T = 0$ the phase separation trial function yields a finite groundstate energy lower than the one obtained with a Gaussian trial function, and the collapsed fraction of the chain ($f_c$) is found to be exactly the one given by equation (47).

We conclude that, on the one hand we have good evidence that lattice and continuum models exhibit the same 
qualitative behavior, if constrained annealing is considered. On the other hand, a good equilibrium description in the 
case of quenched disorder for this model seems to be lacking at present, and carrying on the constrained annealing 
approximation procedure (e.g. by fixing correlations between different hydrophilicities, as explained in the previous 
section) may be one way to address the problem of quenched disorder. 
Figure Captions



Fig. 1. Qualitative phase diagram in the $(\beta_1, \beta_2)$ plane: the solid line is the tri-critical $\theta$-line which ends in the multi-critical point; the dashed line is the coexistence line.

Fig. 2. Mean number of monomer-solvent close contacts $A_2(\beta_1, \beta_2)$ at varying $\beta_1$ for different fixed values of $\beta_2$, with strip width from 2 to 6. The compact-to-swollen transition is continuous for $\beta_2 < \beta_2^m$ and first order for $\beta_2 > \beta_2^m$, with the (very rough) estimate $\beta_2^m \simeq 0.75$.

Fig. 3. Example of a transfer matrix element. Empty circles are solvent molecules and dashed lines show the nearest-neighbor monomer-solvent contacts. Configuration $i$ is defined at column $r$ and takes into account how the walk steps back to column $r - 2$ (solid line in (b)). Configuration $j$ is defined at column $r + 1$ and takes into account how the walk steps back to column $r - 1$, and thus partially overlaps with configuration $i$. The dotted line in (b) shows the non-overlapping part of configuration $j$. In this example the quantities needed for the computation of the matrix elements ( ]2q ) are $t_{ij} = 5$ and the vector (0,0,2,0,2,0) (the sites of column $r - 1$ are ordered from the top to the bottom of the strip).

Fig. 4. Mean number of monomer-solvent close contacts $A_{2,n}$ and thermal exponent $\nu_{n,n-1}$ at varying temperature, with strip width $n$ from 2 to 6, in the case of unconstrained annealing with $\lambda_0 = -1$ and $\lambda = 0.7$. Evidence is provided for a second order swollen-to-compact $\theta$-transition (see the crossings of different $n$-estimates of the thermal exponent around the $\theta$-value $\nu_\theta \simeq 0.57$), and for a first order compact-to-swollen transition (see the abrupt jump of the order parameter $A_{2,n}$). The thermal exponent strongly fluctuates at the first order transition due to the phenomenological renormalization method employed for its calculation.

Fig. 5. Mean number of monomer-solvent close contacts $A_{2,n}$ at varying temperature, with strip width $n$ from 2 to 5, in the fixed mean case. The behavior of the order parameter is the same (swollen at high temperature and then $\theta$-transition to the collapsed phase) as in the annealed case (see Figure 4) until coexistence starts at $\beta \simeq 1.7$ and $A_{2,n}$ starts increasing slowly; the asymptotic ($\beta \to \infty$) value is $A_2 = 1$. The behavior of the thermal exponent is the same as in the annealed case until coexistence starts. At coexistence the thermal exponent should be the same as in the swollen phase, as long as a finite fraction of the chain is swollen, but due to limited numerical precision we get highly fluctuating values.

Fig. 6. Trajectories in the $(\beta_1, \beta_2)$ plane for different values of $\lambda_{\mathrm{eff}}$ in the fixed mean case, with strip width $n = 5$. The transition line has been located at the crossing of two consecutive $n$-estimates of $A_2$ (see Figure 2).






Fig. 7. As in Figure 5, in the fixed mean and variance case. The asymptotic ($\beta \to \infty$) value is $A_2 \simeq 0.18$ (see equation @).

Fig. 8. Free energies $f_{a_0}$, $f_{a_1}$ and $f_{a_2}$ of the three considered annealed cases, at varying temperature, with the same $\lambda_{\mathrm{eff}}$, and strip width $n = 5$. While $f_{a_0}$ and $f_{a_1}$ diverge linearly to $-\infty$ as $\beta \to \infty$, $f_{a_2}$ is constant in the same limit.